
feat(quotas): Enforce rate limits on metrics buckets [INGEST-1654] #1515

Merged 40 commits into master on Oct 17, 2022

Conversation

jjbayer (Member) commented Oct 5, 2022

Apply rate limits to metrics buckets from the transactions namespace.

Background

Dynamic Sampling introduces a new type of quota, transactions_processed, which defines how many transactions should be "processed", i.e. whether or not we should extract metrics from them. This quota is always >= the transactions quota, which defines how many transactions should be stored & indexed.

The new data category should not only apply to incoming transaction payloads (see #1517), but also to metrics buckets, which may have been extracted by an upstream Relay.

Moreover, processing Relays must count the number of transactions for which metrics have been dropped, and create outcomes for them, in the same way that we create accepted outcomes (see getsentry/sentry#39236).

Data Flow

Processing Relays check rate limits against Redis. Currently the only actor that has access to this rate limiter is the EnvelopeProcessor, and we want to make use of its thread pool to make the blocking calls to Redis. This makes it necessary to redirect metrics buckets to the processor before sending them to the EnvelopeManager for publishing:

Old Flow (still applies under certain conditions)

 ┌──────────┐    ┌────────────┐ ┌───────────────┐
 │Aggregator│    │ProjectCache│ │EnvelopeManager│
 └────┬─────┘    └─────┬──────┘ └──────┬────────┘
      │                │               │
      │  FlushBuckets  │               │
      ├───────────────►│               │
      │                │  SendMetrics  │
      │                ├──────────────►│
      │                │               │

New Flow

 ┌──────────┐  ┌────────────┐       ┌─────────────────┐ ┌─────┐ ┌───────────────┐
 │Aggregator│  │ProjectCache│       │EnvelopeProcessor│ │Redis│ │EnvelopeManager│
 └────┬─────┘  └─────┬──────┘       └────────┬────────┘ └──┬──┘ └──────┬────────┘
      │              │                       │             │           │
      │ FlushBuckets │                       │             │           │
      ├─────────────►│                       │             │           │
      │              │ RateLimitFlushBuckets │             │           │
      │              ├──────────────────────►│             │           │
      │              │                       ├────────────►│           │
      │              │                       │             │           │
      │              │                       │◄────────────┤           │
      │              │                       │             │           │
      │              │                       │             │           │
      │              │                       │       SendMetrics       │
      │              │                       ├─────────────┬──────────►│
      │              │                       │             │           │

Business Logic

  1. ProjectCache receives a FlushBuckets message from the metrics aggregator.
  2. Count the number of transactions that contributed to these buckets (see below).
  3. Check if cached rate limits apply.
  4. If so, drop transaction-related buckets, generate outcomes, and send the buckets to EnvelopeManager.
  5. Otherwise, if processing mode is enabled, send buckets to the EnvelopeProcessor.
  6. The processor communicates the transaction count to Redis and applies the rate limit if it is reached.
  7. The processor sends a message to the project cache to update the cached rate limits.
  8. Drop transaction-related buckets, generate outcomes, and send the buckets to EnvelopeManager.
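The routing decision in steps 3–5 can be sketched as a small function. This is a hedged sketch: `Destination` and `route_flush` are illustrative names invented here, not Relay's actual types.

```rust
// Hedged sketch of the dispatch in steps 3-5 above; `Destination` and
// `route_flush` are illustrative names, not Relay's actual types.

#[derive(Debug, PartialEq)]
enum Destination {
    /// A cached rate limit applies (or this is not a processing Relay):
    /// drop limited buckets, emit outcomes, send the rest on directly.
    EnvelopeManager,
    /// Processing Relay with no cached limit: consult Redis first.
    EnvelopeProcessor,
}

fn route_flush(cached_limit_active: bool, processing_enabled: bool) -> Destination {
    if cached_limit_active || !processing_enabled {
        Destination::EnvelopeManager
    } else {
        Destination::EnvelopeProcessor
    }
}

fn main() {
    assert_eq!(route_flush(true, true), Destination::EnvelopeManager);
    assert_eq!(route_flush(false, false), Destination::EnvelopeManager);
    assert_eq!(route_flush(false, true), Destination::EnvelopeProcessor);
    println!("ok");
}
```

Either way, the buckets end up at the EnvelopeManager; the only question is whether a Redis round-trip happens first.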

Counting processed transactions

  • For d:transactions/duration@millisecond, increment the counter in redis by the length of the bucket's value. This is an accurate count of the number of transactions that contributed to the bucket.
  • For any other metric, do not increment the counter in redis, but still enforce the rate limit if the quota is currently exceeded.
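The counting rule above can be sketched as follows; `Bucket` here is a simplified stand-in for Relay's bucket type, and the non-counted metric name in `main` is an arbitrary example, not a real Relay metric.

```rust
// Hedged sketch of the counting rule above; `Bucket` is a simplified
// stand-in for Relay's bucket type, not the real definition.

const COUNTED_METRIC: &str = "d:transactions/duration@millisecond";

struct Bucket {
    metric_name: String,
    /// Distribution values; for the duration metric, one entry per transaction.
    values: Vec<f64>,
}

/// Quantity by which to increment the Redis counter for this bucket.
/// A quantity of 0 still checks the limit without counting anything.
fn redis_quantity(bucket: &Bucket) -> usize {
    if bucket.metric_name == COUNTED_METRIC {
        bucket.values.len() // one transaction per recorded duration
    } else {
        0 // check-only: enforce if already exceeded, but do not count
    }
}

fn main() {
    let duration = Bucket {
        metric_name: COUNTED_METRIC.to_owned(),
        values: vec![12.5, 30.0, 7.1],
    };
    let other = Bucket {
        metric_name: "c:transactions/other_example@none".to_owned(),
        values: vec![1.0],
    };
    assert_eq!(redis_quantity(&duration), 3);
    assert_eq!(redis_quantity(&other), 0);
    println!("ok");
}
```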

Not in this PR

  • Enforcement of rate limits on the fast path.
  • Enforcement of rate limits on metrics buckets in the sessions namespace.


jan-auer added a commit that referenced this pull request Oct 7, 2022
For #1515, it is required to check for a required quota of another data
category without incrementing it. This PR updates the Redis LUA script
to support a rate limiting quantity of `0`, which checks for existing
rate limits without incrementing internal counters.

The rate limiter gains a new explicit branch to check whether the
quantity is `0`. 
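A minimal in-memory sketch of that quantity-0 branch, using a local counter as a stand-in for the Redis Lua script (the `Counter` type and method name are assumptions made for illustration, not Relay's API):

```rust
// In-memory stand-in for the Redis counter, showing only the explicit
// quantity-0 branch described above; this is a sketch, not the Lua script.

struct Counter {
    used: u64,
    limit: u64,
}

impl Counter {
    /// Consumes `quantity` from the quota and reports whether the limit
    /// is exceeded. With `quantity == 0`, this only checks for an existing
    /// rate limit and leaves the internal counter untouched.
    fn is_rate_limited(&mut self, quantity: u64) -> bool {
        if quantity == 0 {
            return self.used > self.limit;
        }
        self.used += quantity;
        self.used > self.limit
    }
}

fn main() {
    let mut counter = Counter { used: 0, limit: 2 };
    assert!(!counter.is_rate_limited(0)); // nothing consumed yet
    assert!(!counter.is_rate_limited(2)); // used = 2: at, but not over, the limit
    assert!(counter.is_rate_limited(1)); // used = 3 > 2: limited
    assert!(counter.is_rate_limited(0)); // check-only call sees the active limit
    println!("ok");
}
```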

Co-authored-by: Jan Michael Auer <mail@jauer.org>
* master:
  fix(quotas): Make redis rate limiter work with quantity 0 (#1519)
  ref: Remove unused rate_limits from ProcessEnvelopeState (#1516)
  fix(quotas): Use correct string spelling for TransactionProcessed (#1514)
@jjbayer jjbayer self-assigned this Oct 10, 2022
@jjbayer jjbayer marked this pull request as ready for review October 12, 2022 15:17
@jjbayer jjbayer requested review from a team, olksdr and jan-auer October 12, 2022 15:17
@jan-auer jan-auer changed the title Enforce rate limits on metrics buckets [INGEST-1654] feat(quotas): Enforce rate limits on metrics buckets [INGEST-1654] Oct 13, 2022
match rate_limits {
    Ok(rate_limits) => {
        // If a rate limit is active, discard transaction buckets.
        if let Some(limit) = rate_limits.limit_for(DataCategory::TransactionProcessed) {
jjbayer (Member, Author):
Is it even necessary to filter by data category here, given the item_scoping? Should we use rate_limits.longest() instead?

@@ -601,6 +606,7 @@ impl Project {
///
/// The buckets will be keyed underneath this project key.
pub fn merge_buckets(&mut self, buckets: Vec<Bucket>) {
// TODO: rate limits
jjbayer (Member, Author):
We will add a call to BucketLimiter::enforce_limits() to metrics_allowed() in a follow-up PR.

@jjbayer jjbayer requested a review from olksdr October 17, 2022 07:47
olksdr (Contributor) left a comment:
I have only one question/suggestion, otherwise lgtm!

// outcome is generated.
//
// Returns true if any buckets were dropped.
pub fn enforce_limits(&mut self, rate_limits: Result<&RateLimits, ()>) -> bool {
olksdr (Contributor):

Accepting a Result into the function seems a little strange, since we ignore the Err anyway.
Could we just accept Option<&RateLimits> instead? In my opinion that would make the interface a little cleaner.

jjbayer (Member, Author):

I see your point, given that the Err variant is empty anyway. But I feel that using a Result instead of an Option makes it clear to the reader that this is an error case.

olksdr (Contributor):

When I look at this, my first thought is that all errors should have been handled before calling this function, and we should only get here if we are ok with enforcing the limits (whatever that means in this case).

If we had an Option, None would tell me: there are no limits to enforce, just emit the outcome; and once there is Some(_), enforce what you can 😄

@jjbayer jjbayer merged commit 4ffe338 into master Oct 17, 2022
@jjbayer jjbayer deleted the feat/metrics-enforce-quotas branch October 17, 2022 11:57
jjbayer added a commit that referenced this pull request Oct 24, 2022
### Background

In #1515 we implemented rate-limiting functionality for metrics buckets, which is applied after they are flushed from the metrics aggregator by checking Redis.

### This PR

When a quota is already exhausted, and that information has already been
cached on the project state, there is no need to go through the
aggregator at all. Instead, we apply the rate limiting logic _before_
sending metrics or buckets into the aggregator.

### Implementation details

The utility type `BucketLimiter` was converted to a generic
`MetricsLimiter<T>`, where `T` can be either `Metric` or `Bucket`.
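One plausible shape for such a generic is a trait that both item types implement; the trait and field names below are assumptions made for illustration, not Relay's actual definitions.

```rust
// Illustrative shape of the generic described above; the trait and field
// names are assumptions, not Relay's actual definitions.

trait TransactionCount {
    /// Number of processed transactions this metric or bucket represents.
    fn transaction_count(&self) -> usize;
}

struct MetricsLimiter<T: TransactionCount> {
    items: Vec<T>,
}

impl<T: TransactionCount> MetricsLimiter<T> {
    /// Total transaction count to report to the rate limiter.
    fn total_transactions(&self) -> usize {
        self.items.iter().map(|i| i.transaction_count()).sum()
    }
}

// A dummy item type standing in for `Metric` or `Bucket`.
struct Item(usize);

impl TransactionCount for Item {
    fn transaction_count(&self) -> usize {
        self.0
    }
}

fn main() {
    let limiter = MetricsLimiter { items: vec![Item(3), Item(0), Item(2)] };
    assert_eq!(limiter.total_transactions(), 5);
    println!("ok");
}
```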

### Notes

Will be merged into #1537 &
deployed together.
jan-auer added a commit that referenced this pull request Oct 25, 2022
* master:
  release: 0.8.15
  fix(py): Respect the renormalize flag (#1548)
  (fix)e2e: Use report self hosted issues env variable (#1539)
  meta(vscode): Enable all features in Rust-Analyzer (#1542)
  release: 0.8.14
  build(craft): Fix manylinux artifact name (#1547)
  feat(quotas): New data category for indexed transactions (#1535)
  test(auth): Unflake re_auth_failure (#1531)
  replays: add warning log for parse errors (#1534)
  fix(server): Retain valid cached project states on error (#1426)
  feat(protocol): Implement response context schema (#1529)
  feat(replays): emit org_id on recording kafka messages (#1528)
  feat: Add .NET/Portable-PDB specific protocol fields (#1518)
  feat(quotas): Enforce rate limits on metrics buckets (#1515)
  ref(pii): Consider all token as sensitive [INGEST-1550] (#1527)
  release: 22.10.0
3 participants